Probabilistic Disease Classification of Expression-Dependent Proteomic Data from Mass Spectrometry of Human Serum
Abstract
We have developed an algorithm called Q5 for probabilistic classification of healthy versus diseased whole serum samples using mass spectrometry. The algorithm applies principal components analysis (PCA) followed by linear discriminant analysis (LDA) to whole-spectrum surface-enhanced laser desorption/ionization time-of-flight (SELDI-TOF) mass spectrometry (MS) data and is demonstrated on four real datasets of complete, complex SELDI spectra of human blood serum. Q5 is a closed-form, exact solution to the problem of classifying complete mass spectra of a complex protein mixture, and its probabilistic classifier is built upon a dimension-reduced linear discriminant analysis. Our solution is computationally efficient: it is noniterative and computes the optimal linear discriminant using closed-form equations. The optimal discriminant is computed and verified on each of the four datasets, and replicate experiments with different training/testing splits of each dataset are used to verify the robustness of the algorithm. The probabilistic classification method achieves sensitivity, specificity, and positive predictive values above 97% on three ovarian cancer datasets and one prostate cancer dataset. The Q5 method outperforms previous full-spectrum classification techniques for complex samples and can provide clues as to the molecular identities of differentially expressed proteins and peptides.
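The pipeline described in the abstract (PCA, then a closed-form linear discriminant, then a probabilistic class assignment scored by sensitivity, specificity, and positive predictive value) can be illustrated with a minimal sketch. This is not the authors' Q5 implementation: the function names, the synthetic "spectra", the choice of five principal components, the equal class priors, and the 1-D Gaussian model used for the posterior are all assumptions made for illustration, and NumPy is assumed to be available.

```python
import numpy as np


def fit_pca_lda(X, y, n_components):
    """Fit PCA (via SVD), then a two-class Fisher discriminant in PCA space."""
    mean = X.mean(axis=0)
    Xc = X - mean
    # Rows of Vt are principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    W_pca = Vt[:n_components].T
    Z = Xc @ W_pca
    Z0, Z1 = Z[y == 0], Z[y == 1]
    m0, m1 = Z0.mean(axis=0), Z1.mean(axis=0)
    # Within-class scatter matrix; the Fisher direction has a closed form.
    Sw = (Z0 - m0).T @ (Z0 - m0) + (Z1 - m1).T @ (Z1 - m1)
    w = np.linalg.solve(Sw, m1 - m0)
    # Assumption: model each class's projected scores as a 1-D Gaussian.
    p0, p1 = Z0 @ w, Z1 @ w
    gauss = {0: (p0.mean(), p0.std(ddof=1)), 1: (p1.mean(), p1.std(ddof=1))}
    return {"mean": mean, "W_pca": W_pca, "w": w, "gauss": gauss}


def predict_proba_disease(model, X):
    """Posterior probability of class 1 (disease), assuming equal priors."""
    s = ((X - model["mean"]) @ model["W_pca"]) @ model["w"]

    def log_pdf(v, mu, sigma):
        return -0.5 * ((v - mu) / sigma) ** 2 - np.log(sigma)

    diff = log_pdf(s, *model["gauss"][0]) - log_pdf(s, *model["gauss"][1])
    return 1.0 / (1.0 + np.exp(np.clip(diff, -700, 700)))


def sens_spec_ppv(y_true, y_pred):
    """Sensitivity, specificity, and positive predictive value."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    return tp / (tp + fn), tn / (tn + fp), tp / (tp + fp)


# Synthetic stand-in for spectra: 80 samples x 500 intensity bins,
# with ten bins shifted in the "disease" class (hypothetical data).
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 40)
X = rng.normal(size=(80, 500))
X[y == 1, 100:110] += 3.0

# One random training/testing split; repeat with other seeds for replicates.
order = rng.permutation(80)
train, test = order[:60], order[60:]
model = fit_pca_lda(X[train], y[train], n_components=5)
proba = predict_proba_disease(model, X[test])
y_pred = (proba >= 0.5).astype(int)
print(sens_spec_ppv(y[test], y_pred))
```

The number of retained components must stay below the number of training samples minus two, otherwise the within-class scatter matrix is singular and the closed-form solve fails.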
Similar resources
Proteomic Analysis of Gene Expression in Basal Cell Carcinoma
Background: Basal Cell Carcinoma (BCC) is a type of non-melanoma skin cancer. Alteration in gene expression is an important event in cancer cells, and it can be detected by proteomics techniques. Methods: Normal and tumor tissues were taken from BCC patient. Total proteins were purified by standard methods, and proteins were separated by two-dimensional electrophoresis...
A New Hybrid Feature Subset Selection Algorithm for the Analysis of Ovarian Cancer Data Using Laser Mass Spectrum
Introduction: A major problem in the treatment of cancer is the lack of an appropriate method for the early diagnosis of the disease. The chemical reaction within an organ may be reflected in the form of proteomic patterns in the serum, sputum, or urine. Laser mass spectrometry is a valuable tool for extracting the proteomic patterns from biological samples. A major challenge in extracting such ...
Alpha-1 antitrypsin, retinol binding protein and keratin 10 alterations in patients with psoriasis vulgaris, a proteomic approach
Objective(s): Psoriasis is an autoimmune disease that appears on the skin. Although psoriasis is clinically and histologically well characterized, its pathogenesis is not known in detail. The aims of this study were to evaluate the proteome of psoriatic patients' sera and to compare it with that of healthy individuals to find valuable biomarkers. Materials and Methods: In a case-control study,...
High Throughput Quantitative Analysis of Serum Proteins Using Glycopeptide Capture and Liquid Chromatography Mass Spectrometry
It is expected that the composition of the serum proteome can provide valuable information about the state of the human body in health and disease and that this information can be extracted via quantitative proteomic measurements. Suitable proteomic techniques need to be sensitive, reproducible, and robust to detect potential biomarkers below the level of highly expressed proteins, generate dat...
Journal title:
- Journal of computational biology : a journal of computational molecular cell biology
Volume 10, Issue 6
Publication date: 2003